ABSTRACT

ALKBH5 is a 2-oxoglutarate (2OG) and ferrous iron-dependent
nucleic acid oxygenase (NAOX) that catalyzes the demethylation
of N⁶-methyladenine in RNA. ALKBH5 is upregulated under hypoxia
and plays a role in spermatogenesis. We describe a crystal
structure of human ALKBH5 (residues 66-292) to 2.0 Å resolution.
ALKBH5₆₆₋₂₉₂ has a double-stranded β-helix core fold as observed
in other 2OG and iron-dependent oxygenase family members. The
active site metal is octahedrally coordinated by an HXD...H
motif (comprising residues His204, Asp206 and His266) and three
water molecules. ALKBH5 shares a nucleotide recognition lid and
conserved active site residues with other NAOXs. A large loop
(βIV-V) in ALKBH5 occupies a similar region as the L1 loop of
the fat mass and obesity-associated protein that is proposed to
confer single-stranded RNA selectivity. Unexpectedly, a small
molecule inhibitor, IOX3, was observed covalently attached to
the side chain of Cys200 located outside of the active site.
Modelling substrate into the active site based on other
NAOX-nucleic acid complexes reveals conserved residues
important for recognition and demethylation mechanisms. The
structural insights will aid in the development of inhibitors
selective for NAOXs, for use as functional probes and for
therapeutic benefit.



INTRODUCTION 

Nucleic acid modifications are found in all forms of life 
and play key roles in the regulation of gene expression (1). 
More than 100 different types of post-transcriptional 
modifications have been identified in RNA (2-4); 
many with unassigned functions. Some RNA modifications have
been intensively studied, e.g. the 5′ RNA 7-methylguanosine cap
[m⁷G(5′)ppp(5′)N] in messenger RNA (mRNA) (5), but nonetheless,
emerging data support the proposal that modifications to RNA
constitute an important general mechanism for the regulation of
gene expression in both healthy and diseased cells (4,6).

The most abundant internal modification observed in eukaryotic
mRNA is methylation of adenine to give N⁶-methyladenine (m⁶A)
(7). Recent advances in high-throughput sequencing methods have
enabled more detailed analysis of this modification in cells
(8,9). The context-dependent physiological roles of m⁶A are
currently being explored, and small molecule tools to probe the
functions of this modification will be useful. The S-adenosyl
methionine-dependent methyltransferase-like 3, METTL3/MT-A70,
catalyzes mRNA adenine N⁶-methylation (10). Potential m⁶A
binding proteins ('readers') have been identified, including
(embryonic lethal, abnormal vision, Drosophila)-like 1 (ELAVL1),
YTH domain family member 2 (YTHDF2) and YTH domain family
member 3 (YTHDF3) (8). Two human 2-oxoglutarate (2OG) and
iron-dependent oxygenase enzymes, the fat mass and
obesity-associated protein (FTO) and AlkB homologue 5 (ALKBH5),
have been found to catalyze m⁶A demethylation (11,12),
indicating that RNA methylation/demethylation is a dynamic
modification.

2OG and iron-dependent oxygenases are widely distributed in
aerobic and facultative anaerobic life forms (13-15). 2OG
oxygenases use Fe(II) as a co-factor and 2OG and molecular
oxygen as co-substrates to catalyze a broad range of chemical
reactions including epimerizations, cyclizations, desaturations
and hydroxylations (16). In humans, >60 2OG oxygenases have
been identified, which have diverse cellular functions including
in hypoxia sensing, collagen stabilization, fatty acid
metabolism, RNA splicing and epigenetics (17). 2OG oxygenases
are structurally characterized by having a core double-stranded
β-helix fold (DSBH), which acts as a scaffold for a conserved
HXD/E...H triad of residues. Together with water molecules
and/or cosubstrates, the side chains of these residues
octahedrally coordinate the Fe(II) cofactor. The 2OG cosubstrate
binds between the major and minor β-sheets of the DSBH and
occupies two of the six active site metal coordination sites
(14). 2OG oxygenase substrates are recognized by various
structural elements within and surrounding the active site (13).

A subfamily of 2OG oxygenases acts on nucleic acids (nucleic
acid oxygenases [NAOXs]). AlkB was the first 2OG oxygenase to
be characterized as an N-methylated nucleic acid demethylase
(18,19). In Escherichia coli (and other bacteria), AlkB is
induced on exposure to toxic alkylating agents such as methyl
methanesulfonate and enables DNA repair by catalyzing
demethylation of 1-methyladenine (m¹A) and 3-methylcytosine
(m³C) lesions (18-20). Human homologues of AlkB have been
identified: AlkB homologues 1-8 (ALKBH1-8) and FTO (21-24).
ALKBH1 was demonstrated to have abasic or
apurinic/apyrimidinic (AP) lyase activity independent of 2OG
and Fe(II), although no nucleic acid demethylase activity has
yet been reported for it (25,26). ALKBH2 demethylates m¹A and
m³C of double-stranded DNA (dsDNA), while ALKBH3 selectively
catalyzes demethylation of m¹A and m³C in single-stranded DNA
(ssDNA) (23,24,27,28). ALKBH8 catalyzes hydroxylation of
5-methoxycarbonylmethyluridine (5mcmU) on transfer RNA (tRNA),
and the JmjC domain oxygenase TYW5 hydroxylates tRNA
wybutosine (yW) (29-32). Both 5mcmU and yW are modifications to
bases at the wobble position of tRNA. Other 2OG oxygenases
acting on nucleic acid substrates have been identified,
including the ten-eleven translocation enzymes (TETs 1-3), which
oxidize 5-methylcytosine (5mC) to sequentially form
5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and
5-carboxycytosine (5caC) (33-35).

Following from pioneering structural work on NAOXs
preferentially acting on N-methylated DNA, i.e. AlkB and
subsequently ALKBH2/3 (27,28,36,37,38), structures of NAOXs
acting on RNA, including those of ALKBH8, FTO and TYW5, have
been reported (32,39,40). However, to date, the structures of
FTO (39,41) are the only ones reported for a NAOX that
preferentially catalyzes N-demethylation of methylated mRNA.
FTO is a potential therapeutic target for obesity (42,43), and
various groups are working to develop selective inhibitors for
FTO (41,44), both as functional probes and for target
validation. Hence, structural studies on ALKBH5, a NAOX that,
like FTO, catalyzes the demethylation of N-methylated mRNA, are
of both basic and pharmaceutical interest. Consequently, we
undertook structural studies on ALKBH5 with a view to comparing
it with FTO.

Scheme 1. ALKBH5 and FTO catalyze m⁶A demethylation.

It was recently reported that ALKBH5 (Uniprot entry Q6P6C2) is
localized to the nucleus, has 2OG and iron-dependent activity
and is upregulated under hypoxic conditions by the
hypoxia-inducible factor (HIF) transcription factor pathway
(45). ALKBH5 was observed to be highly expressed in the lung,
followed by testis, pancreas, spleen and ovary (46). It was also
reported that ALKBH5 localized to nuclear speckles and that
decreased ALKBH5 levels affect mRNA export and processing,
leading to altered spermatogenesis (12). High-throughput
analyses have provided evidence for multiple post-translational
modifications to ALKBH5, including serine- and tyrosine-residue
phosphorylation as well as alanine- and lysine-residue
acetylation (47-50). Importantly, further work identified m⁶A in
single-stranded RNA (ssRNA) as a substrate for ALKBH5 (12)
(Scheme 1); m⁶A in mRNA was also shown to be a substrate for
FTO (11).

Here we describe crystallographic studies on ALKBH5 
and compare its active site characteristics with those of 
other reported NAOX structures (27,28,32,36,39,40,51). 
The results reveal both conserved and distinctive 
features of ALKBH5 and FTO, which will be useful in 
the development of selective inhibitors. 



MATERIALS AND METHODS 

Protein expression and purification 

A plasmid with the vector backbone pNIC28-Bsa4 encoding a
hexahistidine-tagged ALKBH5₆₆₋₂₉₂ construct was transformed into
E. coli BL21 (DE3) cells (45). The transformed cells were grown
at 37 °C until an OD₆₀₀ of 0.6-0.8 was reached. ALKBH5
expression was then induced with 0.5 mM isopropyl
β-D-1-thiogalactopyranoside (IPTG). Cell growth was then
continued for 20 h at 18 °C. The cells were then harvested by
centrifugation (Beckman Avanti J-25, rotor JA10, 7000 × g,
8 min), and the resultant cell pellets were stored at −80 °C.
The frozen cell pellets were thawed and resuspended in 50 mM
tris(hydroxymethyl)aminomethane-hydrochloride (Tris-HCl) pH 7.5,
500 mM sodium chloride (NaCl) and 10 mM imidazole and
subsequently lysed by sonication on ice. The lysates were then
centrifuged (Beckman Avanti J-25, rotor JA25.5, 43 400 × g,
20 min), and the supernatant was loaded onto a 5-ml HisTrap HP
column (GE Healthcare) and purified using an ÄKTA FPLC system.
The column was washed with 50 mM Tris-HCl pH 7.5, 500 mM NaCl
and 40 mM imidazole, and the protein was eluted with 50 mM
Tris-HCl pH 7.5, 500 mM NaCl and 375 mM imidazole.
Ethylenediaminetetraacetic acid (EDTA) was added to the protein
solution to a final concentration of 100 mM, and the solution
was incubated on ice for 30 min before being buffer-exchanged
into 25 mM Tris-HCl pH 7.5 and 100 mM NaCl. The protein was
further purified using a (5 ml) HiTrap Heparin column (GE
Healthcare) followed by a (20 ml) MonoQ column; in both cases,
the protein was eluted with a gradient from 25 mM Tris-HCl
pH 7.5 to 25 mM Tris-HCl pH 7.5 and 500 mM NaCl. The purified
protein was then buffer-exchanged into 25 mM Tris-HCl pH 7.5
and concentrated to 11 mg/ml for storage at −80 °C.

Crystallization 

Crystallization trials of ALKBH5₆₆₋₂₉₂ were carried out in the
presence of Mn(II) [as a non-reactive Fe(II) substitute] and
various known 2OG oxygenase inhibitors (52). ALKBH5₆₆₋₂₉₂
(MW 28.7 kDa) was crystallized in sitting drops at 293 K by the
vapour diffusion method in the presence of Mn(II) and
(1-chloro-4-hydroxyisoquinoline-3-carbonyl)glycine (IOX3).
Crystallization drops contained 0.2 µl of a protein solution
containing a final concentration of 10 mg/ml
hexahistidine-tagged ALKBH5₆₆₋₂₉₂, 0.5 mM manganese(II) chloride
and 2 mM IOX3 (41,53) mixed with 0.1 µl of well solution
containing 125 mM potassium nitrate and 15% (w/v) polyethylene
glycol 3350. Crystals (size ~100 × 50 × 50 µm) appeared after
3 months. Crystals were harvested using nylon loops and
cryoprotected using well solution diluted with 25% (v/v)
glycerol and flash-cooled in liquid nitrogen.

Data collection and structure determination 

Data (0.2° oscillation/image, 180° total rotation) were
collected at 100 K on a single crystal at Diamond synchrotron
beamline I24 using a wavelength of 0.97889 Å and a Pilatus3 6M
detector. All data were indexed, integrated and scaled using
HKL3000 (54). The PHENIX (55) subroutine ENSEMBLER (56) was
used to generate an ensemble using the structures of human
homologues of AlkB: FTO (PDB ID 4IE5), ALKBH2 (PDB ID 3S57),
ALKBH3 (PDB ID 2IUW) and ALKBH8 (PDB ID 3THT) to be used as a
search model for phasing by molecular replacement. The
structure was then solved by molecular replacement using PHASER
(56). An initial model of ALKBH5 was generated by PHENIX
AUTOBUILD (57), which included 411 residues (~97% of the final
number of observed residues making up the two molecules in the
asymmetric unit). Iterative cycles of model building and
refinement were performed using COOT (58) and PHENIX (55) until
R and Rfree converged and no longer decreased. See Table 1 for
detailed data collection and refinement statistics.

Activity assays 

In vitro demethylation assays (12) were performed in triplicate
in a 50 µl reaction mixture containing 4 µM ALKBH5₆₆₋₂₉₂, varied
concentrations of 5-mer ssRNA (10, 20, 50, 100 and 200 µM) with
the sequence 5′-GGm⁶ACU-3′ (ELLA Biotech, Munich, Germany),
300 µM 2OG, 2 mM L-ascorbate, 150 µM diammonium Fe(II) sulfate
complex and 25 mM Tris-HCl, pH 7.5. The reaction mixtures were
incubated at room temperature, and 2 µl of sample from each
reaction was quenched with 2 µl of 20% (v/v) formic acid at
specific time points. One microlitre of each quenched sample
was then mixed with 1 µl of matrix-assisted laser desorption
ionization (MALDI) matrix made up of two parts 0.5 M
2,4,6-trihydroxyacetophenone in ethanol and one part 0.1 M
ammonium citrate dibasic in water. The relative quantities of
product and substrate were analysed using MALDI-ToF mass
spectrometry (MS) (Supplementary Figure S1a). The
Michaelis-Menten curve was fit using non-linear regression, and
the Kₘ of the substrate was estimated using GraphPad Prism
(Supplementary Figure S1b).

Table 1. Crystallographic data collection and refinement statistics
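
The Kₘ estimate in the activity assay above comes from non-linear regression of reaction rate against substrate concentration. The sketch below illustrates that kind of fit in plain Python; it is not the algorithm used by GraphPad Prism. It assumes the standard Michaelis-Menten rate law and a least-squares criterion, and the rate values are hypothetical, generated from assumed parameters (Vmax = 2.0, Kₘ = 50 µM) rather than taken from the study. Only the substrate concentrations match those listed in the assay.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def best_vmax(km, conc, rates):
    """For a fixed Km the model is linear in Vmax, so least squares is closed-form."""
    f = [s / (km + s) for s in conc]
    return sum(fi * v for fi, v in zip(f, rates)) / sum(fi * fi for fi in f)


def sse(km, conc, rates):
    """Sum of squared residuals at this Km, using the optimal Vmax for it."""
    vmax = best_vmax(km, conc, rates)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(conc, rates))


def fit_michaelis_menten(conc, rates, log_lo=-3.0, log_hi=4.0, step=0.05):
    """Return (vmax, km) minimising the squared error.

    A coarse scan over log10(Km) brackets the minimum, then a
    golden-section search refines it inside that bracket.
    """
    n = int(round((log_hi - log_lo) / step))
    grid = [log_lo + i * step for i in range(n + 1)]
    best = min(grid, key=lambda x: sse(10 ** x, conc, rates))
    a, b = best - step, best + step
    inv_phi = (math.sqrt(5) - 1) / 2
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = sse(10 ** c, conc, rates), sse(10 ** d, conc, rates)
    for _ in range(100):
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = sse(10 ** c, conc, rates)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = sse(10 ** d, conc, rates)
    km = 10 ** ((a + b) / 2)
    return best_vmax(km, conc, rates), km


# Substrate concentrations (uM) as listed for the assay; the rates are
# HYPOTHETICAL values generated from Vmax = 2.0 and Km = 50 uM.
conc = [10.0, 20.0, 50.0, 100.0, 200.0]
rates = [mm_rate(s, 2.0, 50.0) for s in conc]
vmax_fit, km_fit = fit_michaelis_menten(conc, rates)
print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.1f} uM")
```

Because Vmax enters the model linearly, it can be eliminated analytically, leaving a one-dimensional search over Kₘ. With real triplicate data, the measured rates would replace the synthetic `rates` list.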

Inhibition assays were performed in triplicate for each
inhibitor in a 25 µl reaction mixture (final volume) containing
4 µM ALKBH5₆₆₋₂₉₂, 80 µM 5-mer ssRNA with the sequence
5′-GGm⁶ACU-3′, 150 µM 2OG, 2 mM L-ascorbate, 150 µM diammonium
Fe(II) sulfate complex, 150 µM inhibitor [N-oxalylglycine (NOG),
2,4-pyridinedicarboxylate (2,4-PDCA) or IOX3] and 25 mM
Tris-HCl, pH 7.5. Controls in triplicate without inhibitors were
also set up. Reactions were incubated at room temperature and
quenched after 5 min with an equal volume of 20% (v/v) formic
acid and analyzed using MALDI-ToF MS.

ALKBH5 Cys200-IOX3 MS 

A 0.2 µl crystallization drop mixture incubated for 16 weeks
containing an ALKBH5 crystal was first dissolved in 1 µl of 6 M
urea in 100 mM Tris-HCl, pH 7.8, and further diluted with 30 µl
of the same buffer. Then 1.5 µl of 200 mM dithiothreitol in
100 mM Tris-HCl pH 7.8 was added, and the mixture was vortexed
and incubated for 10 min at room temperature. Six microlitres
of 200 mM iodoacetamide in 100 mM Tris-HCl pH 7.8 was added,
and the mixture was vortexed and incubated for 10 min at room
temperature. A further 6 µl of 200 mM dithiothreitol in 100 mM
Tris-HCl pH 7.8 was added, and the mixture was vortexed and
incubated for an additional 10 min at room temperature. The
mixture was then diluted with 230 µl of MilliQ-H₂O and
vortexed. Five microlitres of Glu-C (2 µg/µl in 100 mM Na/K
phosphate pH 7.2; Promega, Staphylococcus aureus V8, MS grade)
was then added to the sample and incubated at 37 °C overnight
according to the standard procedure (59).

The digested peptides were then purified by first equilibrating
a C₁₈ Sep-Pak cartridge (Waters, WAT020515) with 5 ml of elution
buffer [65% (v/v) acetonitrile and 0.1% (v/v) formic acid in
MilliQ-H₂O], followed by 10 ml of wash buffer [2% (v/v)
acetonitrile and 0.1% (v/v) formic acid in MilliQ-H₂O]. The
sample was then loaded onto the column and washed with 10 ml of
wash buffer. The column was eluted with 1.5 ml of elution
buffer and collected in a 1.5 ml tube. Peptides were dried in a
SpeedVac and resuspended in 20 µl of wash buffer for analysis.

Peptides were analyzed using a nanoACQUITY UPLC coupled to
SYNAPT HDMS interfaced with a nano-electrospray source (Waters
Corporation, Milford, MA, USA). Peptide digests were injected on
a 5 µm Symmetry C₁₈ column (180 µm × 20 mm) and washed for
1 min at 15 µl min⁻¹ with 0.1% (v/v) formic acid. Peptides were
then separated and eluted for MS analysis using a gradient of
acetonitrile containing 0.1% (v/v) formic acid at 300 nl min⁻¹
over 23 min on a nanoACQUITY UPLC column (BEH130 C₁₈, 1.7 µm
particle size, 75 µm inner diameter × 250 mm length). The
column temperature was set at 35 °C. The reference for the
nanolockspray was set to the doubly charged peak of
Glu-fibrinopeptide B at a concentration of 500 fmol µl⁻¹ flowing
at 400 µl min⁻¹. The reference was constantly infused and
sampled at 30 s intervals.

The eluted peptides were analyzed in the positive ionization
mode over a mass range of 50-1990 m/z with a scan time of 0.6 s.
The online-eluted peptides were analysed using an MSᴱ method
collecting MS/MS data using collision energy ramping from 15 to
35 eV. Spectra were processed using BioLynx (Waters Corporation,
Milford, MA, USA).

Theoretical modelling of substrate binding 

The ALKBH2-dsDNA structure (PDB ID 3BUC) and the
AlkB-tri-nucleotide structure (PDB ID 3I20) were used as
templates for modelling binding of a 5-mer ssRNA (sequence:
5′-GGm⁶ACU-3′) into the ALKBH5 active site. The ALKBH2-dsDNA
structure was first superimposed on ALKBH5. The protonation
states for aspartate, glutamate, histidine and lysine residues
were then assigned, followed by geometric optimization of the
positions of the hydrogen atoms by restrained energy
minimization using the OPLS-2005 force-field (60). A GB/SA
effective water model (61) was chosen as the solvent model, and
non-bonded electrostatic interactions were truncated with a
cut-off distance of 20 Å. The convergence process was
terminated after 1000 cycles with a 0.05 kcal/mol/Å gradient
threshold. After energy minimization, QM/MM calculations (62)
were performed to obtain the optimized geometry for the active
site metal, His204, His266, Asp206, 2OG and m⁶A-modified ssRNA
at the DFT-M06-2X/6-31G** level (63) using a functional
consisting of a meta-hybrid of the GGA DFT functional (64) and
the M06 family functional adapted for organometallic structures
(65).

RESULTS AND DISCUSSION 

ALKBH5₆₆₋₂₉₂ has been reported as being catalytically active for
uncoupled 2OG turnover (45). Following from the report of a
longer construct of ALKBH5 (residues 66-394) catalyzing
demethylation of m⁶A in a 15-mer ssRNA containing a
5′-GGm⁶ACU-3′ motif (12), we confirmed that our shorter
ALKBH5₆₆₋₂₉₂ construct was also catalytically active and
determined kinetic constants using a 5-mer ssRNA substrate with
the sequence 5′-GGm⁶ACU-3′ (Supplementary Figure S1a and b). We
then carried out end point inhibition assays to investigate
whether the enzymatic activity of ALKBH5 can be inhibited by
the generic 2OG oxygenase inhibitors NOG, 2,4-PDCA (66) and the
more specific compound IOX3, which was originally developed as
a prolyl hydroxylase (PHD) inhibitor (52). Relative to no
inhibitor controls, all three inhibitors inhibited the activity
of ALKBH5₆₆₋₂₉₂. 2,4-PDCA was the most potent of the three
inhibitors tested (9% residual activity), followed by IOX3 (40%
residual activity) and NOG (44% residual activity)
(Supplementary Figure S2). In contrast, IOX3 was shown to be a
more potent inhibitor of FTO than both NOG and 2,4-PDCA (44).
IOX3 has been in clinical trials as a HIF PHD inhibitor (52).
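
The residual activities quoted above express turnover in the presence of an inhibitor as a percentage of the no-inhibitor control. The sketch below shows one plausible way to compute such a value from MALDI-ToF peak intensities. The exact data processing used in the study is not described, so the conversion formula, the function names and all intensity values here are assumptions for illustration only.

```python
from statistics import mean


def fraction_demethylated(product_intensity, substrate_intensity):
    """Fraction of substrate converted, from product and substrate peak intensities."""
    return product_intensity / (product_intensity + substrate_intensity)


def residual_activity(inhibited_conversions, control_conversions):
    """Percent residual activity: mean conversion with inhibitor
    relative to mean conversion of the no-inhibitor control."""
    return 100.0 * mean(inhibited_conversions) / mean(control_conversions)


# HYPOTHETICAL triplicate (product, substrate) peak intensities.
control = [fraction_demethylated(p, s) for p, s in [(500, 500), (520, 480), (480, 520)]]
with_inhibitor = [fraction_demethylated(p, s) for p, s in [(200, 800), (210, 790), (190, 810)]]

print(f"Residual activity: {residual_activity(with_inhibitor, control):.0f}%")
```

Normalising to the control measured in the same experiment means the result does not depend on the absolute extent of turnover at the 5 min end point.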

Structure determination of ALKBH5 

Following crystallization trials with various 2OG oxygenase
inhibitors, we obtained crystals of ALKBH5 in the presence of
Mn(II) and IOX3. We then determined a crystal structure of
ALKBH5₆₆₋₂₉₂. The space group was determined to be P22₁2₁ with
unit cell constants a = 67.1 Å, b = 82.7 Å, c = 89.2 Å;
α = β = γ = 90°. The calculated Matthews coefficient
(Vₘ = 2.2 Å³/Da) with a solvent content of 43% indicated two
molecules per asymmetric unit (Chain A and Chain B). The
structure of ALKBH5₆₆₋₂₉₂ was solved by molecular replacement
using an ensemble of ALKBH2, ALKBH3, FTO and ALKBH8 (final
translation function Z score 8.8). The model output from PHASER
was based on FTO. AUTOBUILD successfully generated a model of
ALKBH5₆₆₋₂₉₂ with R and Rfree values of 0.2386 and 0.2714. The
ALKBH5₆₆₋₂₉₂ model was then improved by iterations of manual
fitting and refinement to final R and Rfree values of 0.1646 and
0.2181, respectively.
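
The Matthews coefficient and solvent content quoted above can be reproduced from the unit cell, the space group and the construct mass (28.7 kDa, as given in the crystallization section). The sketch below assumes four symmetry operators for the orthorhombic space group P22₁2₁ and the commonly used approximation that solvent fraction is 1 − 1.23/Vₘ. The function names are illustrative.

```python
def matthews_coefficient(a, b, c, n_symops, n_mol_asu, mw_da):
    """Vm in cubic angstroms per dalton for an orthorhombic cell (all angles 90 degrees)."""
    cell_volume = a * b * c
    return cell_volume / (n_symops * n_mol_asu * mw_da)


def solvent_fraction(vm, protein_constant=1.23):
    """Approximate solvent fraction from Vm (standard 1.23 A^3/Da protein constant)."""
    return 1.0 - protein_constant / vm


# Cell constants and molecular weight taken from the text.
A, B, C = 67.1, 82.7, 89.2   # angstroms
MW = 28700.0                 # daltons (28.7 kDa construct)
N_SYMOPS = 4                 # assumption: P22(1)2(1) has four symmetry operators

for n_mol in (1, 2, 3):
    vm = matthews_coefficient(A, B, C, N_SYMOPS, n_mol, MW)
    print(f"{n_mol} molecule(s)/ASU: Vm = {vm:.2f} A^3/Da, "
          f"solvent = {100 * solvent_fraction(vm):.0f}%")
```

Two molecules per asymmetric unit give Vₘ ≈ 2.2 Å³/Da and about 43% solvent, matching the reported values. One molecule gives Vₘ above 4 Å³/Da, and three give a value below 1.5 Å³/Da, so both alternatives lie well away from the reported value.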

Overall ALKBH5 structure 

As observed in other 2OG oxygenase structures (13,14), the
conserved DSBH core fold of ALKBH5 consists of eight
anti-parallel β-strands βI-VIII (β6-13), which form two
β-sheets: the major β-sheet (strands β6, 8, 11 and 13) and the
minor β-sheet (strands β7, 9, 10, 12) (Figure 1a). Three extra
β-strands, β1, β2 and β3, extend the major β-sheet. The DSBH is
flanked by three helices α1, α2 and α3 (Figure 1b). The DSBH
acts as the scaffold for the three Fe(II)-ligating residues
His204, Asp206 and His266. Mn(II) is observed bound in the
Fe(II) binding site, and the 2OG binding pocket is positioned
in a cavity between the two β-sheets of the DSBH with the more
open end of the cavity apparently providing substrate access to
the active site, a feature common to all 2OG-dependent
oxygenases (14). Electron density was observed for 219 residues
of Chain A (residues 66-68 and 145-149 were not observed) and
206 residues of Chain B (residues 66-77 and 143-151 were not
observed). The structures of Chains A and B are similar with a
calculated root-mean-square deviation of 0.31 Å for 206 Cα
atoms. The active site openings of both Chains A and B are
buried at the interface between the two chains in the
asymmetric unit (Figure 2 and Supplementary Figure S3).
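
The residue counts above can be checked against the construct boundaries (residues 66-292) and the unobserved ranges listed for each chain. The short sketch below does this arithmetic and compares the total with the 411 residues reported for the initial AUTOBUILD model. The helper name is illustrative.

```python
def observed_residues(first, last, missing_ranges):
    """Count residues with density, given inclusive construct limits
    and inclusive ranges of residues that were not observed."""
    total = last - first + 1
    missing = sum(hi - lo + 1 for lo, hi in missing_ranges)
    return total - missing


# Construct boundaries and disordered ranges as stated in the text.
chain_a = observed_residues(66, 292, [(66, 68), (145, 149)])
chain_b = observed_residues(66, 292, [(66, 77), (143, 151)])
final_total = chain_a + chain_b

# 411 residues were built in the initial AUTOBUILD model.
autobuild_fraction = 411 / final_total

print(chain_a, chain_b, final_total, f"{100 * autobuild_fraction:.0f}%")
```

The counts come out at 219 and 206 residues, 425 in total, so the 411 residues of the initial model correspond to roughly 97% of the final model, consistent with the figures given in the methods.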

The N-terminal region of the ALKBH5₆₆₋₂₉₂ structure begins with
a long α-helix (α1, residues 70-88). Secondary structure
prediction (69) for residues 1-66 of ALKBH5 indicates mostly
α-helix in this region, which may form a helical bundle with
'α1' or possibly a coiled coil motif. A crystal structure of
ALKBH8 reveals an extended mixed α/β N-terminal domain, termed
an RNA recognition motif, which is believed to be involved in
tRNA binding (40). We cannot rule out a similar role for the
N-terminal region of ALKBH5₆₆₋₂₉₂ (partially absent in this
construct) in substrate binding. In ALKBH5, the nucleotide
recognition lid (NRL), observed in all NAOXs, possesses two
β-hairpin-like loops, β2-3 (NRL1) and β4-5 (NRL2) (see below)
(Figure 3a).

A loop between strands βIV and βV (residues 229-242), which
contains a 3₁₀-helix (3₁₀2), extends from the DSBH forming an
outer wall of the active site (Figure 1a). The βIV-V loop,
which is known to be involved in substrate binding in other 2OG
oxygenases (13,70), is apparently constrained by a disulfide
bond between Cys230 of the βIV-V loop and Cys267 of DSBH strand
βVII (Supplementary Figure S4) and by interactions with the
N-terminal helix α1. Interestingly, the side chain of Cys227 is
positioned to present a surface-exposed thiol and is adjacent
to Cys267; these observations raise the possibility of a
thiol/disulfide 'redox shuffle' of the Cys230-Cys267 disulfide
to form a Cys227-Cys267 disulfide and in turn relax the βIV-V
loop conformation (Supplementary Figure S3). The largest
difference observed between Chains A and B occurs at the apex
of the βIV-V loop between residues 236 and 240. However, this
difference may, at least in part, be a consequence of crystal
packing due to interactions of this region with neighbouring
molecules.

The C-terminal regions of some NAOXs are of functional
importance. The tRNA 5-methoxycarbonylmethyluridine hydroxylase
ALKBH8 has a Zn(II) bound between its DSBH and the C-terminal
region of the oxygenase domain that leads to a C-terminal
methyltransferase domain. The helical bundle C-terminal domain
of FTO has been shown to be required for FTO activity (39). Our
studies show that the C-terminal region of ALKBH5 (residues
293-394) is not essential for m⁶A demethylase activity, at
least when using 5-mer ssRNA in vitro (Supplementary Figure
S1a). The C-terminus of ALKBH5 (residues 293-394) is largely
predicted to be disordered (69) and contains numerous proline,
arginine and serine residues. This is notable because
Arg-Ser-rich regions of SR proteins are involved in RNA binding
(71). The multiple serine residues (Ser361, Ser371, Ser374,
Ser384) flagged as potential phosphorylation sites in ALKBH5
have the potential to regulate RNA interactions.

Modification of Cys200 was observed in electron 
density maps. The shape of the electron density suggested 
that it was derived from the ALKBH5 inhibitor, IOX3, 
used in the crystallization conditions. Subsequent refine- 
ment led to assignment of the modification as arising from 
the reaction of IOX3 with the Cys200 side chain with the 
loss of chlorine (Figure 2a). Two molecules of the cova- 
lently bound IOX3 derivative, each attached to different 
ALKBH5 molecules at Cys200, are observed to be in 
proximity (<3.5 Å) with their aromatic bicyclic rings
stacked against one another, presumably aiding crystal 
packing and formation (Figure 2b). Fragmentation MS 
analysis (MS/MS) of a solubilized ALKBH5 crystal 
supports the proposal that the modification occurs at 
Cys200 as observed in the crystallographic analyses 
(Supplementary Figure S5). 

Although structures of IOX3 have been reported in complex with
other 2OG oxygenases, including PHD2 and FTO (41,52), its
attachment by covalent reaction, as observed with ALKBH5, is
unique. In the structural complexes of IOX3 with PHD2 and FTO,
IOX3 binds non-covalently to the active site metal in a
bidentate manner (41,72,73) (Supplementary Figure S6a). This
binding mode is not possible for the Cys200-linked IOX3 in the
ALKBH5 structure because it is too far from the active site
metal and its position is restrained (Supplementary Figures S6b
and S7). It thus seems most likely that the reaction of IOX3
with Cys200 occurred during the prolonged crystallization
process (up to 12 weeks) via a nucleophilic aromatic
substitution reaction (Supplementary Figure S8). Although the
overall results indicate that the crystallographically observed
alkylation of Cys200 by IOX3 is likely not relevant to the
inhibition of ALKBH5 at the time scale of our inhibition
studies in solution (5 min), the observed reaction does raise
the question as to whether S-arylation via nucleophilic
aromatic substitution may occur during clinical application of
C1-chlorinated isoquinoline derivatives. Nucleophilic aromatic
substitution of amino acids and residues occurs during
derivatization by the Sanger reagent, and was reported for the
irreversible inhibition of the thyroid hormone receptor TRβ by
methylsulfonylnitrobenzoates (74,75). Detailed inhibition
studies of ALKBH5 will be reported in due course.

Active site 

In addition to the conserved HXD...H motif (His204, Asp206,
His266) and arginine (Arg277) from DSBH strand βVIII required
for 2OG binding (see below), a combination of hydrophobic and
polar residues lines the 2OG binding cavity (Ile281, Val279,
Tyr195, Ile268, Leu226, Ile201, Asn193 and Val191). The
His204Ala ALKBH5 variant was found to be inactive, consistent
with an essential role for His204 in catalysis; activity of the
His266Ala variant was substantially impaired (12), in agreement
with prior studies that show two metal ligating residues can be
sufficient for catalytic activity in other 2OG oxygenases (76).
The next shell of residues lining the opening to the 2OG
binding site includes Arg283, Asp125, Arg130, Tyr141, Phe134,
Lys132, Tyr139, Glu153 and Cys200 (Figures 1c and 2a).

Studies with other 2OG oxygenases show that differences in the
2OG binding pocket can be exploited for the development of
selective inhibitors (52,77). The 2OG binding pocket of ALKBH5
includes residues Arg277 and Tyr195 (Figures 1c and 4a), which
likely interact with the C5-carboxylate of 2OG as deduced by
comparison of the ALKBH5 structure with those for
NAOX-2OG/NOG complexes (27,28,36,39,40) (Figure 4b-f).
Interestingly, while FTO possesses the conserved arginine
involved in 2OG C5-carboxylate binding as well as a tyrosine
residue, the 2OG binding tyrosine (Tyr295) in FTO comes from
DSBH βVI (Figure 4c), differentiating it from ALKBH5 and the
other AlkB homologues where a 2OG binding tyrosine is derived
from DSBH βI.

All of the AlkB-like NAOXs, including ALKBH5, also have a
serine residue that interacts with the 2OG C5-carboxylate
binding arginine (Figure 4a-f). This serine is not present in
the wybutosine tRNA hydroxylase TYW5, the 2OG binding pocket
architecture of which is considerably different from those in
the AlkB subfamily (27,28,32,36,39,40). Rather than belonging to
the AlkB-like NAOX subfamily of 2OG oxygenases, TYW5 belongs to
the JmjC oxygenase subfamily (31,32), most of which act as
histone Nε-methyl lysine demethylases and lysyl or arginyl
hydroxylases. Most of the JmjC oxygenases, including TYW5, use
a lysine (derived from DSBH βIV) and a tyrosine (derived from a
β-strand involved in extending the major β-sheet positioned
anti-parallel to DSBH βI) for 2OG C5-carboxylate binding
(Figure 4g). In contrast to the differences observed in 2OG
C5-carboxylate binding between the NAOX and JmjC enzymes, they
all use a conserved binding mechanism for 2OG C1-carboxylate
binding. An asparagine, present in the active site of ALKBH5
(Asn193), interacts with the C1-carboxylate oxygen of 2OG
(Figure 4a-d and f) and is conserved not only in other NAOXs,
but also in many other 2OG oxygenases including some JmjC
domain oxygenases [e.g. JMJD2A (78)].

Figure 1. (a) Ribbons representation of the ALKBH5₆₆₋₂₉₂ structure showing the active site metal Mn(II) (purple sphere) substituting for Fe(II);
residues His204, Asp206 and His266 (white sticks); NRL1 and NRL2 (pink fonts); disordered residues (dashed line); the DSBH β-strands I-VIII
(yellow); other β-strands (salmon); and α- and 3₁₀-helices (blue). (b) Topology of the structure of ALKBH5₆₆₋₂₉₂. β-strands are shown as
triangles, α-helices as large circles and 3₁₀-helices as small circles. The βIV-V loop is highlighted in purple, the DSBH in yellow and the NRLs in salmon.
(c) Active site residues of ALKBH5₆₆₋₂₉₂ with representative electron density (3.0σ mFo-DFc OMIT; green mesh) for side chains of His204, Asp206,
His266 and water molecules (red spheres) all coordinated (black dashed lines) to the active site Mn(II) ion (purple sphere). (d) A ClustalW2 (67)
sequence alignment of ALKBH5 homologues from various organisms [from top: Homo sapiens: human (sequence identity 100%, PDB ID 4NJ4,
Uniprot ID Q6P6C2); Mus musculus: mouse (sequence identity 97%, Uniprot ID Q3TSG4); Gallus gallus: chicken (sequence identity 86%, Uniprot ID F1NIA5);
Xenopus laevis: frog (sequence identity 78%, Uniprot ID Q6GPB5); Danio rerio: zebrafish (sequence identity 72%, Uniprot ID Q08BA6); Strigamia
maritima: centipede (sequence identity 56%, Uniprot ID T1JJ71); and Strongylocentrotus purpuratus: purple sea urchin (sequence identity 52%,
Uniprot ID H3I4D7)]; combined with structure-based sequence alignment (68) with ALKBH3 (PDB ID 2IUW); ALKBH2 (PDB ID 3BUC);
ALKBH8 (PDB ID 3THT); FTO (PDB ID 3LFM); and AlkB (PDB ID 3I3Q). A few selected manual adjustments were made to the alignment
to correct for likely automated errors. Note: Variant residues from the reported ALKBH2 (PDB ID 3BUC) structure sequence were changed to
reflect the wild-type ALKBH2 sequence (Uniprot Q6NS38). Residues highlighted as conserved (dark blue); semi-conserved (light blue); weakly
conserved (grey); conserved 2OG oxygenase catalytic triad HXD...H (red); conserved 2OG binding arginine (green). Boxed residues indicate those
forming NRL1 (red); NRL2 (blue); the βIV-V loop of ALKBH5 (purple); the L1 loop of FTO (black). Secondary structural elements of H. sapiens
ALKBH5 are represented as light blue sinusoidal waves (α-helices), red arrows (β-strands excluding the DSBH core), yellow arrows (β-strands of the
DSBH core) and single light blue arcs (3₁₀-helix). (e) Schematic domain representations of human NAOXs for which structures have been reported.
DSBH, double-stranded β-helix core domain; RRM, RNA recognition motif; MT, methyltransferase domain; and CTD, C-terminal domain.

Figure 2. Binding of (1-chloro-4-hydroxyisoquinoline-3-carbonyl)glycine (IOX3) to ALKBH5 involves covalent attachment (Figure S8). (a) The small
molecule IOX3 (cyan sticks) reacts and forms a covalent bond with the side chain of Cys200 (white sticks); the electron density map (3.0σ mFo-DFc
OMIT; green mesh) is shown. (b) Two protein molecules in an asymmetric unit of an ALKBH5₆₆₋₂₉₂ crystal. Covalently attached IOX3 molecules
from each protein molecule stack against each other via π-π interactions.

The active site of ALKBH5 appears more open than 
that of FTO. This may be due to a combination of the 
longer NRL in FTO as well as its C-terminal domain, 
both of which act to enclose the active site (Figure 3b). 
Whether this apparent difference is reflective of differences 
in m6A RNA substrates accepted by either FTO or 
ALKBH5 is unknown. It may be that ALKBH5 accepts 
bulkier RNA secondary structure in its active site, 
although preliminary results with a stem loop-derived 
RNA substrate suggest otherwise (12). 

Both m6A demethylases, ALKBH5 and FTO, have a 
basic residue (Lys132 and Arg96, respectively) (Figure 4a 
and c) in the active site, which might be involved in 
substrate recognition/selection and/or product release 
following demethylation. Interestingly, proteomics studies 
using deacetylase inhibitors in MV4-11 cells, a human acute 
myeloid leukaemia cell line, have shown that Lys132 can be 
acetylated (79); acetylation of Lys132 likely affects 
activity. The Arg96Met and Arg96Trp FTO variants 
almost completely abolish FTO enzymatic activity (39), 
supporting a crucial role for the conserved basic residue 
at this position in the m6A demethylases. 

Arg283 is adjacent to the Fe(II) binding site in 
ALKBH5. An arginine residue at this position is 
conserved in all other structurally characterized NAOXs 
(27,28,36,39,40) (Figure 4a-f), and in many other 2OG 
oxygenases including deacetoxycephalosporin-C synthase 
and the related oxidase isopenicillin N synthase (80,81). 
This conserved arginine is proposed to be involved in 
oxygen activation (14,16), and its substitution has been 
shown to abrogate activity in the case of ALKBH3 (28). 
Furthermore, there is an acidic residue usually positioned 
near the active site in NAOXs (Glu175 in ALKBH2, 
Glu195 in ALKBH3, Glu234 in FTO and Asp135 
in AlkB), which is present in ALKBH5 as Glu153 
(Figure 4a-e). In the case of FTO, Glu234 has been 
shown to be important in substrate recognition. The 
Glu234Pro variant in FTO abolishes enzymatic activity 
(39). The AlkB Asp135Ala variant abolishes activity 
towards m1A in ssDNA; however, this variant increases 
activity towards m1G (82). 

Substrate recognition/selection elements 

All structurally characterized NAOXs (AlkB, ALKBH2, 
ALKBH3, ALKBH8 and FTO) possess a conserved NRL 
(Figures 1a and 3a-f). The observed NRL (residues 
124-161) of ALKBH5 comprises mixed β and loop 
secondary structure forming two β-hairpin-like loops: NRL1 
(residues 124-137, strands β2-3) and NRL2 (residues 
138-161, strands β4-5). In ALKBH5, NRL1 extends the 
major β-sheet of the DSBH and forms a short type I 
β-turn (residues 127-130). NRL2 is partially disordered at 
the apex (residues 145-149) and is sandwiched between 
DSBH strand βII and the C-terminus (Figure 3a). 
Interestingly, the majority of the NRL sequence is 
observed to be disordered in the ALKBH8 crystal 
structure (Figure 3f), potentially indicating a requirement for 
flexibility in substrate binding (40). NRL1 is shorter in 
ALKBH5 than in the other NAOXs (AlkB, ALKBH2, 
ALKBH3 and FTO) (Figure 3a). The conserved 
ALKBH5 residue Arg130 in NRL1 (Figure 1d) 
may interact with the substrate phosphate backbone 
as was observed for the equivalent residue in the 
ALKBH2:dsDNA complex (Arg110); the Arg131Ala 
variant of ALKBH3, in which Arg131 is equivalent to 
ALKBH5 Arg130, was shown to be inactive (28). The 
sequence of the disordered 
apex of ALKBH5 NRL2 contains two basic residues, 
Lys147 and Arg148, both of which are conserved across 
ALKBH5 homologues in various organisms (Figure 1d) 
and may be important for substrate recognition through 
interactions with the RNA substrate phosphate backbone. 

Figure 3. Comparison of NAOX structures reveals differences in their nucleotide recognition lids. NRL1 (red) and NRL2 (blue). (a) ALKBH5 (PDB 
ID 4NJ4), (b) FTO (PDB ID 3LFM), (c) ALKBH2 in complex with dsDNA (PDB ID 3BUC), (d) ALKBH3 (PDB ID 2IUW) and (e) AlkB in 
complex with dsDNA (PDB ID 3BIE). (f) The NRL sequences of ALKBH8 (PDB ID 3THT) are mostly disordered. (g) TYW5 (PDB ID 3AL5) 
from the JmjC oxygenase subfamily; note the different structural elements for tRNA substrate recognition; potential substrate contact regions are 
coloured brown. (h) Superimposition of ALKBH5 (light orange ribbon), FTO (grey ribbon) and ALKBH2 (not shown) in complex with 
double-stranded DNA (green). The βIV-V loop (purple) of ALKBH5 and L1 loop (black) of FTO overlap with the 'unrepaired' DNA strand (light green), 
potentially conferring single-strand selectivity for ALKBH5 and FTO. 

Figure 4. Comparison of the active site of ALKBH5 with those of other nucleic acid oxygenases. Active site residues of (a) ALKBH5 (white sticks), 
(b) ALKBH2 (light blue sticks), (c) FTO (grey sticks), (d) ALKBH3 (green sticks), (e) ALKBH8 (pink sticks), (f) AlkB (cyan sticks) and (g) TYW5 
(teal sticks) (PDB ID 3AL6). Oxygen (red), nitrogen (blue), phosphorus (orange), sulphur (yellow), m1A base carbon (yellow), Mn(II) (purple 
sphere), Fe(II) (orange sphere), Ni(II) (green sphere), water molecule (red sphere) and electrostatic interaction (black dashed line) are indicated. 

Figure 5. (a and c) ALKBH5 electrostatic surface representation (basic in blue; acidic in red) with 90° rotation along the X-axis and (b and d) 
corresponding ribbon representation. The substrate-binding groove around the active site is largely basic for the binding of the negatively charged 
ssRNA phosphate backbone. The basic region between the βIV-V loop and α1-helix forms a potential substrate binding groove. 

ALKBH5, AlkB, FTO and ALKBH2 share a YXY/F 
motif positioned on the first strand of NRL2 (β4 in 
ALKBH5, ALKBH2 and FTO, and on the equivalent 
of the ALKBH5 'anti-parallel' strand β5 in AlkB) 
(Figure 1d). In the ALKBH2-dsDNA complex structure 
(PDB ID 3BUC), the hydroxyl of the proximal Tyr122 
interacts with the N6 atom of the m1A substrate (3.2 Å) 
(Figure 4b) (27). Although the artificial covalent substrate 
attachment used to obtain the ALKBH2-DNA complex 
structure might slightly alter its position in the active site, 
the observed distance suggests a potential role for Tyr122 
in catalysis. In ALKBH2 and FTO, the distal Phe124 or 
Tyr108 π-stacks directly with the base, and substitution of 
Tyr108 in FTO abolishes the activity (39). This 'base 
stacking' role is partially replaced by Trp69 in AlkB, 
aided by the proximal Tyr76 (taking the place of the 
distal Tyr141 in ALKBH5 due to its position on the 
anti-parallel strand β5) (Figure 4f), which makes 
interactions with Trp69 and the base as well as the phosphate 
backbone. The position of the NRL2 varies in the 
different NAOXs, possibly conferring specificity towards 
modified-base type and/or sequence context. 
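
As an illustration of how the YXY/F motif described above could be located in a one-letter protein sequence, the following is a minimal sketch. It assumes only the motif definition given in the text (Tyr, any residue, then Tyr or Phe); the function name `find_yxyf`, the numbering offset and the toy peptide are hypothetical and are not taken from any NAOX sequence.

```python
import re

# YXY/F motif: Tyr, any single residue, then Tyr or Phe.
# A lookahead is used so that overlapping matches are also reported.
YXYF = re.compile(r"(?=(Y.[YF]))")


def find_yxyf(sequence, first_residue_number=1):
    """Return (residue_number, motif) for each YXY/F match in a sequence."""
    return [
        (match.start() + first_residue_number, match.group(1))
        for match in YXYF.finditer(sequence.upper())
    ]


# Hypothetical peptide for illustration only, NOT a real NAOX sequence;
# the first residue is arbitrarily numbered 100.
toy_peptide = "GSAYKYLTRYGFAA"
hits = find_yxyf(toy_peptide, first_residue_number=100)
print(hits)  # [(103, 'YKY'), (109, 'YGF')]
```

A pattern match of this kind only flags candidate positions; whether a hit lies on the first strand of NRL2 still has to be judged from a structure-based alignment such as that in Figure 1d.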

The βIV-V loop of ALKBH5 contains basic residues 
(Lys231, Lys235 and Arg238) and overlaps with the 
complementary 'unrepaired' strand of dsDNA when superimposed 
with the ALKBH2-dsDNA structure, potentially conferring 
ssRNA selectivity (Figure 3h). In ALKBH5, the βIV-V 
loop includes a 3₁₀-helix that is absent in the other 
ALKBHs. The 3₁₀-helix contains a solvent-exposed 
phenylalanine, Phe234, which might act as a phenylalanine finger to 
flip the m6A base into the active site by inserting between 
flanking bases (see below). The βIV-V loop, along with 
the α1-helix, appears to form a positively charged groove, 
which could be important for binding the negatively charged 
ssRNA phosphate backbone (Figure 5a-d). A similar 
ssRNA selectivity role has been proposed for a loop in 
FTO, the L1 loop, which is located between DSBH 
strands I and II (39). When ALKBH5 and FTO are 
superimposed, the L1 loop of FTO is situated in the same 
relative position near the active site as the βIV-V loop of 
ALKBH5 (Figure 3h). AlkB also has a βIV-V loop, 
although not as long as that observed in ALKBH5. In 
AlkB, Arg161 is located at the apex of the βIV-V loop. 
The AlkB Arg161Ala variant shows a decrease in affinity 
for methylated DNA, but its rate of activity was not 
affected. Thus, AlkB Arg161 is believed to play a role in 
the recognition of damaged bases (82). 

Model of substrate binding 

Based on the structural information available for NAOX 
substrate complexes, we manually docked and 
energy-minimized two different modes of substrate binding to 
ALKBH5. Because ALKBH5 prefers ssRNA as a 
substrate, we modelled the consensus sequence 
5'-GGm6ACU-3' obtained by m6A-seq (8,9). The 5'-3' direction 
of the strand through the active site was kept, consistent 
with that observed for the DNA in complex with AlkB 
and ALKBH2 (27,51). Two base-flipping modes were 
modelled: one with 
the flanking bases directly stacking against each other (as 
observed for ssDNA bound to AlkB) (51) and one with 
the insertion of a phenylalanine finger, Phe234, between 
the flanking bases (as observed for ALKBH2) (27). We 
restrained the docking to preserve the predicted 
interaction between Arg130 and the phosphate backbone 
(as observed in the ALKBH2-dsDNA complex) (27) and 
set the m6A substrate N6-methyl group-metal distance to 
~4.1 Å, which is the average 'substrate carbon to enzyme 
metal distance' observed for most 2OG 
oxygenase-substrate structures (13). Tyr141 was placed in a 
position to interact with the phosphate backbone as for 
the AlkB-dsDNA complex (51). After manually adjusting 
for the two known base-flipping mechanisms, the models 
were energy-minimized to correct for geometry and 
steric interactions. The preliminary results suggest that 
the phenylalanine finger base-flipping mode involving 
Phe234 from the βIV-V loop intercalating the bases 
flanking m6A (Supplementary Figure S9) is more likely 
than the direct base-stacking mode. Although Phe234 
appears relatively distant from the substrate binding 
groove, its position could change on substrate binding in 
an induced-fit mechanism aided by reduction of the 
Cys230-Cys267 disulfide (Supplementary Figure S4). 
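
The distance restraint used in the docking can be illustrated with a short sketch. This is not the modelling protocol used for the models above, which is not specified at this level of detail; it only shows, under stated assumptions, how a methyl carbon to metal distance might be checked against the ~4.1 Å target. The coordinates, the 0.3 Å tolerance, the force constant and the function names are all hypothetical.

```python
import math

TARGET_DISTANCE = 4.1  # Å; average substrate carbon to metal distance quoted in the text
TOLERANCE = 0.3        # Å; arbitrary illustrative tolerance (assumption)


def distance(a, b):
    """Euclidean distance in Å between two (x, y, z) coordinates."""
    return math.dist(a, b)


def restraint_penalty(d, target=TARGET_DISTANCE, k=1.0):
    """Harmonic penalty k * (d - target)**2; zero when d equals the target."""
    return k * (d - target) ** 2


def satisfies_restraint(d, target=TARGET_DISTANCE, tol=TOLERANCE):
    """True if the distance lies within tol of the target."""
    return abs(d - target) <= tol


# Hypothetical coordinates, NOT taken from any deposited structure or model.
methyl_carbon = (1.0, 2.0, 3.0)
metal_ion = (1.0, 2.0, 7.1)

d = distance(methyl_carbon, metal_ion)
print(f"distance = {d:.2f} Å, within tolerance: {satisfies_restraint(d)}")
```

A harmonic term of this form is a common way to express such a restraint during energy minimization, but the actual functional form and weight depend on the software used.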

CONCLUSIONS 

The discovery that RNA N-methylation is reversible 
has opened up new avenues both for developing an 
understanding of the regulation of gene expression and for its 
medicinal exploitation. To date, the only oxygenases 
reported to remove m6A in RNA are ALKBH5 and 
FTO. Both enzymes appear to be physiologically 
important: FTO is associated with obesity, whereas ALKBH5 is 
involved in fertility (12,43). Although ALKBH5 shares 
conserved structural elements with FTO, including a 
similar general active site chemistry and the use of an 
NRL, there are also clear differences. Work with other 
2OG oxygenases (52) suggests these differences can be 
exploited in the development of selective compounds that 
can be used to test the validity of ALKBH5 and FTO as 
medicinal chemistry targets. 